
Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration #10315

Open · wants to merge 3 commits into base: master
Conversation

@nPr0nn commented Nov 15, 2024

Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration with a very simple test file

Description

The current PCA implementation in the repository is incomplete, as outlined in issue #8724. To address this, I implemented a simplified version that computes the covariance matrix and applies the power iteration method to obtain the principal component. This approach produces results consistent with the scipy PCA implementation on the same input matrices.

This is my first time working with ggml, so I would greatly appreciate a review of my implementation to ensure it aligns with best practices for the library.

Additionally, I noticed that @jukofyork used the cross-covariance method in their Python implementation of control-vectors. Inspired by this, I included a utility function to compute the cross-covariance matrix, which could be helpful for anyone exploring that method in this context.

For now, I’ve implemented only the power iteration method since, from my understanding, the focus in most cases is on obtaining the eigenvector corresponding to the largest variance. But it would be nice to implement QR decomposition later on to obtain all eigenvectors.

Tests

A basic test file is included in the cvectors folder and the Makefile as a sanity check. It might be worth considering moving these tests to the appropriate tests folder for consistency with the rest of the project.

Commit: Add complete implementation of the classical PCA algorithm with covariance matrix and power iteration with a very simple test file
ggml_free(ctx);
}

static void compute_cross_covariance(struct pca_params &pca_params,
Collaborator:

Is this normal that this function is unused?

Author:

Yes, I included this function based on @jukofyork's comments about cross-covariance in the issue discussion. While it’s not currently used, it could support future improvements to the algorithm implementation.

@ngxson (Collaborator) left a comment:

LGTM overall. I don't have time to test it yet, but will do another review soon.

Resolved review threads: examples/cvector-generator/vanilla_pca.hpp (outdated), examples/cvector-generator/cvector-generator.cpp (outdated), Makefile
Commit: …e correct ctx_size and add GGML_ASSERT to check v_output
Resolved review threads: examples/cvector-generator/mini-tests/test-vanilla-pca.cpp (outdated), examples/cvector-generator/pca.hpp (outdated)
@ngxson (Collaborator) left a comment:

Looks good. I'll give it a try with cvector-generator.cpp tomorrow and will merge after that.

ggml_status graph_status = ggml_backend_graph_compute(backend, gf);

// Get the graph results (eigenvector and eigenvalue) and store them in b and eigenvalue
if(graph_status == GGML_STATUS_SUCCESS){
Collaborator:

Suggested change:
- if(graph_status == GGML_STATUS_SUCCESS){
+ if (graph_status == GGML_STATUS_SUCCESS) {

Some minor style fix.

eigenvalue = (float)((float*) eigenvalue_tensor->data)[0];

// Check if the similarity is close enough to 1, if so we converged and should break
if(1 - similarity < pca_params.tolerance)
Collaborator:

Same here

@ngxson (Collaborator) commented Nov 20, 2024

I'm having "tensor buffer not set" while running cvector-generator:

  * frame #0: 0x0000000190862600 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019089af70 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001907a7908 libsystem_c.dylib`abort + 128
    frame #3: 0x00000001000211f8 llama-cvector-generator`ggml_abort(file="ggml/src/ggml-backend.cpp", line=261, fmt=<unavailable>) at ggml.c:169:5 [opt]
    frame #4: 0x0000000100038104 llama-cvector-generator`ggml_backend_tensor_set(tensor=<unavailable>, data=<unavailable>, offset=<unavailable>, size=<unavailable>) at ggml-backend.cpp:261:5 [opt]
    frame #5: 0x00000001001d5d44 llama-cvector-generator`main at pca.hpp:321:9 [opt]
    frame #6: 0x00000001001d572c llama-cvector-generator`main(argc=<unavailable>, argv=<unavailable>) at cvector-generator.cpp:492:9 [opt]
    frame #7: 0x0000000190518274 dyld`start + 2840

The command that I used:

make llama-cvector-generator -j
./llama-cvector-generator -m ../models/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf -ngl 99

@slaren (Collaborator) commented Nov 20, 2024

This is likely caused by all the direct assignments of tensor->data with malloc. This is not compatible with ggml-backend; the tensors need to be allocated into a buffer with ggml_backend_alloc_ctx_tensors or similar.
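
As a rough, pseudocode-style sketch of the pattern described here (it assumes the ggml headers and an already-initialized backend, and is not compilable on its own; exact setup may differ):

```
// sketch: allocate tensor data through ggml-backend instead of writing tensor->data
struct ggml_init_params params = {
    /*.mem_size   =*/ ggml_tensor_overhead() * 8,
    /*.mem_buffer =*/ NULL,
    /*.no_alloc   =*/ true,   // context holds metadata only; data lives in a backend buffer
};
struct ggml_context * ctx = ggml_init(params);
struct ggml_tensor  * t   = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n_embd);

// allocate all tensors of this context into one backend buffer
ggml_backend_buffer_t buf = ggml_backend_alloc_ctx_tensors(ctx, backend);

// copy data in through the backend API rather than assigning t->data
ggml_backend_tensor_set(t, src_data, 0, ggml_nbytes(t));
```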

@slaren (Collaborator) commented Nov 20, 2024

While trying to reproduce this I found other problems:

  • covariance in run_single_pca is not big enough and causes a buffer overflow
  • The tensors saved in the gguf do not have names, so it fails when they are added to the gguf file

The last one is especially puzzling to me... was this tested at all?
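
On the naming point: gguf stores tensors by name, so each output tensor needs a name before being written. A minimal sketch (the name string and context variables are illustrative only):

```
ggml_set_name(t, "direction.0");   // name must be set before adding to gguf
gguf_add_tensor(gguf_ctx, t);      // per the comment above, this fails for unnamed tensors
```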

@nPr0nn (Author) commented Nov 20, 2024

@ngxson Oof, I'll have to investigate what is going wrong, probably during the weekend. This didn't happen when computing PCA on float matrices during my tests.

@slaren I don't think there is any direct assignment to tensor->data with malloc in the pca.hpp file anymore; I've changed it to use ggml_backend_tensor_set and ggml_backend_tensor_get. Maybe it's something related to this? Btw, I don't understand why the covariance matrix would not be big enough.

@slaren (Collaborator) commented Nov 20, 2024

The mallocs are in cvector-generator.cpp. This is a possible way to solve that:

diff --git a/examples/cvector-generator/cvector-generator.cpp b/examples/cvector-generator/cvector-generator.cpp
index e7c924fb..17881c3a 100644
--- a/examples/cvector-generator/cvector-generator.cpp
+++ b/examples/cvector-generator/cvector-generator.cpp
@@ -2,6 +2,8 @@
 #include "common.h"
 #include "llama.h"
 #include "ggml.h"
+#include "ggml-cpp.h"
+#include "ggml-alloc.h"

 #include "mean.hpp"
 #include "pca.hpp"
@@ -193,6 +195,7 @@ struct train_context {
     // to easily re-alloc when concat v_diff, we temporary store v_diff in a vector instead of a tensor
     // v_diff_tmp will get converted unto v_diff later on
     std::vector<std::vector<uint8_t>> v_diff_tmp;
+    ggml_backend_buffer_ptr buffer;

     train_context(int n_embd_, int n_layers_) {
         n_embd = n_embd_;
@@ -207,9 +210,9 @@ struct train_context {
             std::vector<uint8_t> empty;
             v_diff_tmp.push_back(empty);
             auto t = ggml_new_tensor_1d(ctx_ggml, GGML_TYPE_F32, n_embd);
-            t->data = malloc(ggml_nbytes(t)); // TODO: get rid of malloc if possible
             v_final.push_back(t);
         }
+        buffer.reset(ggml_backend_alloc_ctx_tensors_from_buft(ctx_ggml, ggml_backend_cpu_buffer_type()));
     }

     // add new rows into existing tensor in v_diff_tmp

@slaren (Collaborator) commented Nov 20, 2024

> Btw I don't understand why the covariance matrix would not be big enough

I don't know what the size is supposed to be, but if you build with LLAMA_SANITIZE_ADDRESS=1 you should see the error. It may depend on the model; I used the very small stories260k.gguf from https://huggingface.co/ggml-org/models/tree/main/tinyllamas

@nPr0nn (Author) commented Nov 20, 2024

Oooh, the mallocs are in cvector-generator.cpp; seems like this file will need to be refactored then.

Thanks! I'll test everything on the weekend, try to reproduce the problem with these models, and investigate why the covariance matrix is overflowing to solve it as well.

@ngxson (Collaborator) commented Nov 21, 2024

Please note that some mallocs were used in cvector-generator.cpp because some tensor sizes are unknown from the beginning (e.g. some depend on the number of input tokens).

There are other places where I simply use malloc to temporarily store tensor data before writing it to GGUF.

Refactoring those would be nice.

3 participants